DRAM Aware Last-Level-Cache Policies for Multi-core Systems

Author

  • Fazal Hameed
Abstract

x latency DTC in two cycles. In contrast, state-of-the-art DRAM caches always read the tags from the DRAM cache itself, incurring high tag-lookup latencies of up to 41 cycles. In summary, high DRAM cache hit latencies, increased inter-core interference, increased inter-core cache evictions, and the large application footprints of complex applications necessitate efficient policies to satisfy these diverse requirements and improve overall throughput. This thesis addresses how to design DRAM caches that reduce DRAM cache hit latency, DRAM cache miss rate, and hardware cost, while taking both application and DRAM characteristics into account, by presenting novel DRAM-aware and application-aware policies. The proposed policies are evaluated on various applications from SPEC2006 using a cycle-accurate multi-core simulator based on SimpleScalar, modified to incorporate DRAM in the cache hierarchy. The combination of the proposed complementary DRAM-aware and application-aware policies improves the average performance of latency-sensitive applications by 47.1% and 35% on an 8-core system compared to [102] and [77], respectively, while requiring 51% less hardware overhead.
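The latency gap the abstract describes can be made concrete with a minimal sketch: expected tag-lookup latency when a small on-chip tag cache (DTC) resolves some fraction of lookups in 2 cycles, versus a baseline that always reads tags from the DRAM cache at up to 41 cycles. The cycle counts come from the abstract; the hit-rate value is purely illustrative.

```python
# Minimal sketch: expected tag-lookup latency with and without an on-chip
# DRAM Tag Cache (DTC). Cycle counts (2 and 41) are quoted in the abstract;
# the 90% DTC hit rate below is an illustrative assumption.

DTC_HIT_CYCLES = 2     # tag found in the on-chip tag cache
DRAM_TAG_CYCLES = 41   # tags read from the DRAM cache (worst case)

def avg_tag_lookup_latency(dtc_hit_rate: float) -> float:
    """Expected tag-lookup latency in cycles for a given DTC hit rate."""
    return dtc_hit_rate * DTC_HIT_CYCLES + (1.0 - dtc_hit_rate) * DRAM_TAG_CYCLES

# A baseline that always reads tags from DRAM pays the full 41 cycles;
# with a 90% DTC hit rate the expected latency drops to 5.9 cycles.
print(avg_tag_lookup_latency(0.0))  # 41.0
print(avg_tag_lookup_latency(0.9))  # 5.9
```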


Similar Articles

DRAM-Aware Last-Level Cache Replacement

The cost of last-level cache misses and evictions depends significantly on three major performance-related characteristics of DRAM-based main memory systems: bank-level parallelism, row-buffer locality, and write-caused interference. Bank-level parallelism and row-buffer locality introduce different latency costs for the processor to service misses: parallel or serial, fast or slow. Write-caused...
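The fast-or-slow distinction above can be sketched with a toy model of per-bank row buffers: an access to the currently open row in a bank is fast, while any other row pays the precharge-plus-activate penalty. The latency values are illustrative assumptions, not figures from the paper, and the serial sum deliberately ignores the bank-level overlap a real controller would exploit.

```python
# Toy model of row-buffer locality: each DRAM bank keeps one row open, and
# accesses to that row are much cheaper than accesses that must open a new
# row. Latency values below are illustrative assumptions.

ROW_HIT_CYCLES = 15    # open-row access (illustrative)
ROW_MISS_CYCLES = 45   # precharge + activate + access (illustrative)

def service_cycles(accesses):
    """Serial latency of (bank, row) accesses with one row buffer per bank.

    Real controllers overlap misses to different banks (bank-level
    parallelism), so this serial sum is a pessimistic upper bound.
    """
    open_row = {}          # bank -> currently open row
    total = 0
    for bank, row in accesses:
        if open_row.get(bank) == row:
            total += ROW_HIT_CYCLES   # row-buffer hit: row already open
        else:
            total += ROW_MISS_CYCLES  # row-buffer miss: open the new row
            open_row[bank] = row
    return total

# Two accesses to the same row cost 45 + 15; two different rows cost 45 + 45.
print(service_cycles([(0, 7), (0, 7)]))  # 60
print(service_cycles([(0, 7), (0, 8)]))  # 90
```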


DRAM-Aware Last-Level Cache Writeback: Reducing Write-Caused Interference in Memory Systems

Read and write requests from a processor contend for the main memory data bus. System performance depends heavily on when read requests are serviced since they are required for an application’s forward progress whereas writes do not need to be performed immediately. However, writes eventually have to be written to memory because the storage required to buffer them on-chip is limited. In modern ...
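The read/write contention described above can be sketched as a simple bus-scheduling model: reads go to the bus immediately because they block forward progress, while writes wait in a finite buffer that is force-drained only when full. The buffer size and drain policy here are illustrative assumptions, not the paper's mechanism.

```python
# Sketch of read-priority scheduling with a bounded write buffer: reads are
# serviced immediately, writes are buffered and drained when the buffer
# fills. Capacity and policy are illustrative assumptions.
from collections import deque

WRITE_BUFFER_CAPACITY = 4  # illustrative on-chip buffer size

def schedule(requests):
    """Return the order requests reach the memory bus.

    requests: iterable of ('R', addr) or ('W', addr) tuples.
    """
    write_buf = deque()
    bus_order = []
    for kind, addr in requests:
        if kind == 'R':
            bus_order.append(('R', addr))    # reads go straight to the bus
        else:
            write_buf.append(('W', addr))    # writes wait in the buffer
            if len(write_buf) == WRITE_BUFFER_CAPACITY:
                bus_order.extend(write_buf)  # forced drain: buffer is full
                write_buf.clear()
    bus_order.extend(write_buf)              # final drain at end of trace
    return bus_order

order = schedule([('W', 1), ('R', 2), ('W', 3), ('R', 4)])
print(order)  # reads 2 and 4 reach the bus before buffered writes 1 and 3
```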


STT-RAM Aware Last-Level-Cache Policies for Simultaneous Energy and Performance Improvement

High-capacity Last Level Cache (LLC) architectures have been proposed to mitigate the widening processor-memory speed gap. These LLC architectures have been realized using DRAM or Spin-Transfer Torque Random Access Memory (STT-RAM) memory technologies. It has been shown that STT-RAM LLCs provide improved energy efficiency compared to DRAM LLCs. However, existing STT-RAM LLCs suffer from increased...


Write-Aware Replacement Policies for PCM-Based Systems

The gap between processor and memory speeds is one of the greatest challenges current designers face in developing more powerful computer systems. In addition, the scalability of Dynamic Random Access Memory (DRAM) technology is very limited nowadays, leading designers to consider new memory technologies as candidates to replace conventional DRAM. Phase-Change Memory (PCM) is c...


Cache-Aware Virtual Machine Scheduling on Multi-Core Architecture

Facing practical limits to increasing processor frequencies, manufacturers have resorted to multi-core designs in their commercial products. In multi-core implementations, cores in a physical package share the last-level caches to improve inter-core communication. To efficiently exploit this facility, operating systems must employ cache-aware schedulers. Unfortunately, virtualization software, ...




Publication year: 2015